Application of Control Charts in Pairs Trading: An Analysis of BIST30 Stock Indices

The project code, results, and figures are presented below, followed by the discussion and conclusion.

We load all data from Excel files and create one column per stock.

We compute correlations over the first 3600 observations, covering the 1.5 years between January 2018 and July 2019, and find the two most correlated stock pairs: GARAN-AKBNK and SAHOL-VAKBN.
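The pair-selection step can be sketched as follows. This is a minimal illustration rather than the project's actual code: the function name `most_correlated_pairs`, the synthetic price matrix, and the stock labels A to D are all assumptions made for the example.

```python
import numpy as np

def most_correlated_pairs(prices, names, top=2):
    """Return the `top` most correlated column pairs as (name_a, name_b, corr)."""
    corr = np.corrcoef(prices, rowvar=False)   # each column is one stock
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):     # upper triangle only, no self-pairs
            pairs.append((names[i], names[j], corr[i, j]))
    pairs.sort(key=lambda p: p[2], reverse=True)
    return pairs[:top]

# Synthetic stand-in for the training window (3600 rows, four hypothetical stocks).
rng = np.random.default_rng(0)
base = rng.normal(size=3600).cumsum()
prices = np.column_stack([
    base + rng.normal(scale=0.5, size=3600),   # A follows the common factor
    base + rng.normal(scale=0.5, size=3600),   # B follows the common factor
    rng.normal(size=3600).cumsum(),            # C is an independent random walk
    rng.normal(size=3600).cumsum(),            # D is an independent random walk
])
names = ["A", "B", "C", "D"]
top_pairs = most_correlated_pairs(prices, names, top=2)
print(top_pairs)
```

With real data, `prices` would hold only the first 3600 rows, so that the selection uses no information from the test period.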

The linreg function prepares the training data and calls the test_linear function to test the linear regression.

Using the stock symbols, the two functions below prepare the data for the buy and sell orders. Our strategy is to buy a stock if a residual is above the UCL and to sell if a residual is below the LCL.

We call the functions for GARAN-AKBNK and test the trading strategy over the interval between July 2019 and July 2020.

Linear Regression Model for GARAN-AKBNK, using the first 3600 observations as training data.

We use 1-sigma control limits, which are applied in the linreg function. The resulting graph shows the training data with these limits.
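A minimal sketch of residual-based 1-sigma limits, assuming an ordinary least-squares fit of one stock on the other. The helper names `fit_residual_limits` and `residual_signals` and the synthetic data are hypothetical; the project's own linreg function may differ in its details.

```python
import numpy as np

def fit_residual_limits(x_train, y_train, k=1.0):
    """Fit y = a + b*x by least squares; return (a, b, lcl, ucl) for the residuals."""
    b, a = np.polyfit(x_train, y_train, deg=1)   # polyfit returns slope first, then intercept
    resid = y_train - (a + b * x_train)
    center, sigma = resid.mean(), resid.std(ddof=1)
    return a, b, center - k * sigma, center + k * sigma

def residual_signals(x, y, a, b, lcl, ucl):
    """+1 where the residual is above UCL, -1 where it is below LCL, otherwise 0."""
    resid = y - (a + b * x)
    return np.where(resid > ucl, 1, np.where(resid < lcl, -1, 0))

# Synthetic stand-in for two related price series.
rng = np.random.default_rng(42)
x = rng.normal(50, 5, size=500)
y = 2.0 + 0.5 * x + rng.normal(scale=0.3, size=500)

a, b, lcl, ucl = fit_residual_limits(x, y, k=1.0)
signals = residual_signals(x, y, a, b, lcl, ucl)
```

Following the strategy described above, +1 would map to a buy order and -1 to a sell order. The limits come from the training residuals only and are then applied unchanged to the test period.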

These are our orders and profits: GARAN yields a 20% profit over one year, and AKBNK yields a 1% loss over one year.

This graph shows the test data, between July 2019 and July 2020, together with the UCL and LCL.

Linear Regression Model for SAHOL-VAKBN, using the first 3600 observations as training data.

We use 1-sigma control limits, which are applied in the linreg function. The resulting graph shows the training data with these limits.

These are our orders and profits: SAHOL yields a 60% profit over one year, and VAKBN yields a 10% loss over one year.

This graph shows the test data, between July 2019 and July 2020, together with the UCL and LCL.

ARIMA Model for GARAN-AKBNK

We compute the GARAN/AKBNK parity (the ratio of the two prices) because its movement shows how the relation between the two stocks changes over time.

As in the linear regression model, we use the first 3600 observations (January 2018 to July 2019) as training data and the data between July 2019 and July 2020 as test data.

The data are not stationary according to the Augmented Dickey-Fuller test: the p-value is 0.111, which is too high to reject the null hypothesis of a unit root.

The ACF plot shows that the data need first differencing to remove the trend. After first differencing, the data become stationary.

Because the lag-1 autocorrelation is negative, we add an MA term to the model.
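The differencing and lag-1 check can be illustrated with a small NumPy sketch. The `acf` helper and the simulated series are assumptions made for this example; the series is generated so that its first difference has a negative lag-1 autocorrelation, which is the pattern described above.

```python
import numpy as np

def acf(series, nlags):
    """Sample autocorrelation for lags 0..nlags."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[:len(x) - k], x[k:]) / denom for k in range(nlags + 1)])

# Simulated trending series whose increments follow e_t - 0.6 * e_{t-1}.
rng = np.random.default_rng(1)
e = rng.normal(size=2001)
level = 100 + np.cumsum(e[1:] - 0.6 * e[:-1])

diff = np.diff(level)              # first difference removes the stochastic trend

level_acf1 = acf(level, 1)[1]      # close to 1: slow decay, typical of a non-stationary series
diff_acf1 = acf(diff, 1)[1]        # negative: suggests an MA term
print(round(level_acf1, 3), round(diff_acf1, 3))
```

A lag-1 value near 1 for the level series and a clearly negative value for the differenced series correspond to the two observations that motivate the differencing step and the MA term.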

The ACF and PACF plots of the residuals look appropriate, so we can apply the ARIMA model and obtain forecasts. One critical issue is that the forecasts quickly converge to similar values if the training data are not updated. Therefore, after every forecast we add the most recently used test observation to the training data.
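The update step can be sketched as a walk-forward loop. This is only an illustration of the updating idea: `walk_forward` and `naive_forecast` are hypothetical names, and the naive last-value forecaster is a stand-in for the fitted ARIMA model, which is not reproduced here.

```python
import numpy as np

def walk_forward(train, test, forecast_one_step):
    """One-step-ahead forecasts, appending each observed test value to the history."""
    history = list(train)
    preds = []
    for actual in test:
        preds.append(forecast_one_step(history))   # forecast before seeing the actual value
        history.append(actual)                     # then update the training data with it
    return np.array(preds)

def naive_forecast(history):
    """Stand-in forecaster (assumption): predict the last observed value."""
    return history[-1]

train = [1.0, 2.0, 3.0]
test = [4.0, 5.0, 6.0]

updated = walk_forward(train, test, naive_forecast)
static = np.full(len(test), naive_forecast(train))   # no updating: the same forecast every step
print(updated, static)
```

Without updating, the stand-in forecaster repeats one value for the whole test period, which mirrors the convergence issue described above. In the project, `forecast_one_step` would instead refit or extend the ARIMA model on `history`.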

Note: the output of this code is followed by an unwanted block of red text.

We first applied 1.5-sigma limits to the data but did not get enough buy and sell orders, so we tightened the limits to 1-sigma.

The code below implements our trading strategy: it creates a buy order for GARAN and a sell order for AKBNK if the parity is above the UCL, and a buy order for AKBNK and a sell order for GARAN if the parity is below the LCL.
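The rule above can be sketched in a few lines. The function name `parity_orders` and the example numbers are hypothetical, and the sketch assumes that a new order is placed only when the side changes, which is one possible reading of the strategy.

```python
def parity_orders(parity, lcl, ucl):
    """Return (index, buy, sell) orders from the parity series and its limits."""
    orders = []
    side = None                                   # stock currently held, if any
    for i, (p, lo, hi) in enumerate(zip(parity, lcl, ucl)):
        if p > hi and side != "GARAN":
            side = "GARAN"
            orders.append((i, "GARAN", "AKBNK"))  # parity above UCL: buy GARAN, sell AKBNK
        elif p < lo and side != "AKBNK":
            side = "AKBNK"
            orders.append((i, "AKBNK", "GARAN"))  # parity below LCL: buy AKBNK, sell GARAN
    return orders

# Made-up parity values with constant limits, for illustration only.
parity = [1.00, 1.06, 1.07, 1.00, 0.93, 1.00, 1.08]
lcl = [0.95] * len(parity)
ucl = [1.05] * len(parity)

orders = parity_orders(parity, lcl, ucl)
print(orders)
# [(1, 'GARAN', 'AKBNK'), (4, 'AKBNK', 'GARAN'), (6, 'GARAN', 'AKBNK')]
```

The limits are passed as sequences so that they can change at every step.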

We started with $100 in each of GARAN and AKBNK. After applying this more advanced trading strategy, AKBNK yields a 16% profit and GARAN yields a 25% profit.

The figure below shows our test data, predictions, and limits.

ARIMA Model for SAHOL-VAKBN

The data are not stationary according to the Augmented Dickey-Fuller test: the p-value is 0.469, which is too high to reject the null hypothesis of a unit root.

The ACF plot shows that the data need first differencing to remove the trend. After first differencing, the data become stationary.

Because the lag-1 autocorrelation is negative, we add an MA term to the model.

The ACF and PACF plots of the residuals look appropriate, so we can apply the ARIMA model and obtain forecasts. One critical issue is that the forecasts quickly converge to similar values if the training data are not updated. Therefore, after every forecast we add the most recently used test observation to the training data.

Note: the output of this code is followed by an unwanted block of red text.

The code below implements our trading strategy: it creates a buy order for SAHOL and a sell order for VAKBN if the parity is above the UCL, and a buy order for VAKBN and a sell order for SAHOL if the parity is below the LCL.

We started with $100 in each of SAHOL and VAKBN. After applying this more advanced trading strategy, SAHOL yields a 37% profit and VAKBN yields a 1% loss.

The figure below shows our test data, predictions, and limits.

DISCUSSION

In the linear regression model, data between January 2018 and July 2020 are used. The data are divided into training data, from January 2018 to July 2019, and test data, from July 2019 to July 2020. Furthermore, the most correlated stocks are identified using only the January 2018 to July 2019 period. In other words, the trade simulation assumes that we know nothing about the period after July 2019. This is one reason the model is not very successful in trading.

The simple trade simulation is based on the residuals between the values predicted by the linear regression model and the real data. If a residual is above the UCL, we create a buy order for one stock and a sell order for the other, because the model assumes that the two stocks remain correlated. We reverse the orders when a residual falls below the LCL, trying to catch cycles between the stocks. However, the model cannot detect that the correlation between the stocks may decrease permanently; the model could therefore be updated after every new observation.

In the ARIMA model, data between January 2018 and July 2019 are used as training data, and data between July 2019 and July 2020 are used as test data. The stock1/stock2 parity is also computed. The limits follow the 1-sigma rule and are updated with each new observation from the test data, because a time series model should adapt to each new data point. If the parity is above the UCL or below the LCL, a buy order for one stock and a sell order for the other are created, and the position is held until the next out-of-limits event.
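One way to read "limits updated with each new observation" is an expanding window: before each test point, the mean and standard deviation are recomputed from everything seen so far. The sketch below follows that reading as an assumption; the name `expanding_limits` is hypothetical, and the project may instead center its limits on the ARIMA predictions.

```python
import numpy as np

def expanding_limits(train, test, k=1.0):
    """k-sigma limits recomputed before each test point from all data seen so far."""
    history = list(train)
    lcl, ucl = [], []
    for value in test:
        mu = np.mean(history)
        sigma = np.std(history, ddof=1)   # sample standard deviation
        lcl.append(mu - k * sigma)
        ucl.append(mu + k * sigma)
        history.append(value)             # the new observation updates the next limits
    return np.array(lcl), np.array(ucl)

lcl, ucl = expanding_limits([1.0, 2.0, 3.0], [2.0, 10.0], k=1.0)
print(lcl, ucl)
```

In this toy example the first limits are 1.0 and 3.0, and they narrow slightly once the test value 2.0 has been absorbed into the history.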

CONCLUSION

Neither method creates a great profit compared with the normal cumulative returns of the stocks over the same dates. In the stock market, there are many external effects, such as political, sectoral, and company-related developments. Therefore, looking only at the correlation and relation between two series is not enough. However, our models could be improved with different buy-sell ordering strategies and different control limit approaches, such as EWMA charts.
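For reference, an EWMA chart smooths the series as z_t = lam * x_t + (1 - lam) * z_{t-1} and uses time-varying limits around a target mean. The sketch below uses the standard textbook form; it was not part of the project, and the function name `ewma_chart`, the parameter values, and the data are illustrative assumptions.

```python
import numpy as np

def ewma_chart(x, mu0, sigma, lam=0.2, L=3.0):
    """Return (z, lcl, ucl) for an EWMA control chart started at the target mean mu0."""
    x = np.asarray(x, dtype=float)
    z = np.empty_like(x)
    prev = mu0
    for t, value in enumerate(x):
        prev = lam * value + (1 - lam) * prev    # exponentially weighted moving average
        z[t] = prev
    t = np.arange(1, len(x) + 1)
    half_width = L * sigma * np.sqrt(lam / (2 - lam) * (1 - (1 - lam) ** (2 * t)))
    return z, mu0 - half_width, mu0 + half_width

# Made-up data: the mean shifts from 10 to 14 at index 4.
x = [10, 10, 10, 10, 14, 14, 14, 14]
z, lcl, ucl = ewma_chart(x, mu0=10.0, sigma=1.0, lam=0.2, L=3.0)
first_signal = int(np.argmax(z > ucl))           # first index where z exceeds the UCL
print(first_signal)
```

Because the statistic accumulates evidence over several points, an EWMA chart is usually better at detecting small persistent shifts than limits applied to individual observations.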